
Jolt: Converting bytecode to C


6.2 The basic idea

The translator reads a .class file and generates equivalent C code for the methods in that file. It also generates a new .class file that marks all the converted methods as native. Whenever the C code needs to use the VM, it does so through the interface used by native methods.

This approach has its drawbacks. Very little optimization is attempted in this version, beyond cutting out all stack operations by directly addressing the stack and variables of the VM. The short story is that performance is poor in this implementation when methods or field values are accessed frequently, and comparable to native C when neither is involved.

Just to give a feel for what the generated C looks like, this micro benchmark, which runs an empty loop:
public class xxy
{ public void doit(int cnt)
   { for (int i=0; i < cnt; i++)
       for (int j = 0; j < cnt; j++)
          for (int k = 0; k < cnt; k++); } }
(after being compiled to bytecode and then translated to C) looks like
DEFUN(doit)
{
cp_item_type *_cp;
struct fieldblock *_fb;
struct methodblock *_mb;
int32_t var_0 = _P_[0].i;
int32_t var_1 = _P_[1].i;
int32_t var_2;
int32_t var_3;
int32_t var_4;
int32_t stack_1;
int32_t stack_2;

stack_1 = 0;
var_2 = stack_1;
goto jolt_label_36;
jolt_label_5:
stack_1 = 0;
var_3 = stack_1;
goto jolt_label_28;
jolt_label_10:
stack_1 = 0;
var_4 = stack_1;
goto jolt_label_19;
jolt_label_16:
var_4 += 1;
jolt_label_19:
stack_1 = var_4;
stack_2 = var_1;
if (stack_1 < stack_2)
  {goto jolt_label_16;}
var_3 += 1;
jolt_label_28:
stack_1 = var_3;
stack_2 = var_1;
if (stack_1 < stack_2)
  {goto jolt_label_10;}
var_2 += 1;
jolt_label_36:
stack_1 = var_2;
stack_2 = var_1;
if (stack_1 < stack_2)
  {goto jolt_label_5;}
return _P_;
method_exit: return _P_;
}
This translates into very efficient native code when compiled with optimizations. For the same arguments to the method, the compiled (and dynamically linked) code runs in approximately 200ms, compared to 9000ms when the bytecode is interpreted. A similarly coded C loop takes about the same time as the translated code.
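For comparison, a "similarly coded C loop" might look like the sketch below. This is a hand-written illustration, not Jolt output: the name doit_native and the iteration counter are hypothetical. The counter is there only so the result can be checked; the loop body in the original benchmark is empty.

```c
#include <stdint.h>

/* Hypothetical hand-written counterpart of xxy.doit(cnt); not generated
 * by Jolt. Returns the number of innermost iterations, which is cnt^3
 * for cnt > 0 and 0 otherwise. */
int64_t doit_native(int32_t cnt)
{
    int64_t iterations = 0;

    for (int32_t i = 0; i < cnt; i++)
        for (int32_t j = 0; j < cnt; j++)
            for (int32_t k = 0; k < cnt; k++)
                iterations++;   /* stands in for the empty loop body */

    return iterations;
}
```

Calling doit_native(3) returns 27. An optimizing compiler is free to simplify loops like this, so any timing comparison against the translated code should be read with that in mind.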

Ok, big deal. How are real programs dealt with? Method calls are implemented through the hooks available to native methods. For instance, the canonical hello world program

public class xxy
{ public void doit(int cnt)
  { System.out.println("hello world"); } }
gets bytecoded into a field access followed by a method invocation
0 getstatic #7 < Field java.lang.System.out Ljava/io/PrintStream; >
3 ldc #1 < String "hello world" >
5 invokevirtual #8 < Method java.io.PrintStream.println(Ljava/lang/String;)V >
8 return
which is translated into

if (java_lang_System_out_field == -1)
{ if (JoltStaticField(JoltSelfRef,_cp, 7, _EE_) == FALSE)  goto method_exit;
  java_lang_System_out_field = 1; }
_fb = _cp[7].p;
stack_1 = (int32_t) *(OBJECT *)normal_static_address(_fb);

if (JoltConst_1 == -1)
{ if (JoltResolveConst(_cp, 1, _EE_) == FALSE)  goto method_exit;
  JoltConst_1 = 1; }

stack_2 = (int32_t) _cp[1].p;
if (java_io_PrintStream_println8_offset == -1)
{ if (JoltVirtualResolve(JoltSelfRef,_cp, 8, _EE_) == FALSE) goto method_exit;
   _mb = _cp[8].p;
   java_io_PrintStream_println8_offset = _mb->fb.u.offset; }
{ Java8 _tmp;
  _mb =
     mt_slot(obj_methodtable((Handle *) stack_1),
     java_io_PrintStream_println8_offset);
do_execute_java_method(_EE_,(void *)stack_1,NULL,NULL,_mb,FALSE,stack_2);}

if (exceptionOccurred(_EE_))  goto method_exit;
return _P_;
The first pass through the code stores the resolved offsets/pointers in static variables in the C code, so that subsequent passes use the cached information. Needless to say, method invocations slow down execution (partly because do_execute_java_method() packs and unpacks vararg lists), enough to make a long sequence of method invocations little different from an interpreted approach.
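The caching pattern described above can be shown in isolation. The sketch below is a simplified illustration, not the translator's actual output: slow_resolve_offset() is a hypothetical stand-in for a VM lookup such as JoltVirtualResolve(), and the offset it returns is made up. As in the generated code, -1 marks a value that has not been resolved yet.

```c
#include <stdint.h>

/* Counts calls to the slow lookup so the effect of caching is visible. */
static int resolve_calls = 0;

/* Hypothetical stand-in for an expensive VM resolution step.
 * The returned offset is invented for illustration only. */
static int32_t slow_resolve_offset(int cp_index)
{
    resolve_calls++;
    return (int32_t)(cp_index * 4);
}

/* -1 means "not resolved yet", mirroring the sentinel used by the
 * static variables in the generated code. */
static int32_t cached_offset = -1;

/* Resolves on the first call only; later calls reuse the cached value. */
int32_t get_offset(void)
{
    if (cached_offset == -1)
        cached_offset = slow_resolve_offset(8);
    return cached_offset;
}

/* Reports how many times the slow lookup actually ran. */
int get_resolve_calls(void)
{
    return resolve_calls;
}
```

However many times get_offset() is called, the slow lookup runs once. The sentinel only works if -1 can never be a valid resolved value, which is assumed to hold here.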

As another benchmark, at the other end of the spectrum, the compiled version of the translator is about 5% faster than the interpreted version when re-translating itself. This isn't surprising: running the original bytecode under the profiler reveals that approximately 89% of the time is spent writing out to files (essentially System.out.println() calls), which of course are not compiled to C by the translator. The second reason seems to be that a significant portion of the remaining time is spent in recursive method calls that determine the stack depth, which the translation to C does not optimize very well.

Finally, feel free to experiment with benchmarks of your own. Just keep in mind the caveats imposed by the method, and don't draw too many conclusions either way. I'd love to see how they fare. You are welcome to contact me if you can hand out the bytecode and are unwilling to mess around with this system.



KB Sriram
Comments, bug reports: kbs@sbktech.org

Revised: Sat May 25 10:18:34 1996
URL: http://www.sbktech.org/genidea.html